|
Truncated binary encoding is an entropy encoding typically used for uniform probability distributions with a finite alphabet. It is parameterized by the total alphabet size ''n''. It is a slightly more general form of binary encoding that also covers the case where ''n'' is not a power of two.

If ''n'' is a power of two, the coded value for 0 ≤ ''x'' < ''n'' is the simple binary code for ''x'' of length log<sub>2</sub>(''n''). Otherwise, let ''k'' = floor(log<sub>2</sub>(''n'')), so that 2<sup>''k''</sup> ≤ ''n'' < 2<sup>''k''+1</sup>, and let ''u'' = 2<sup>''k''+1</sup> − ''n''.

Truncated binary encoding assigns the first ''u'' symbols codewords of length ''k'' and then assigns the remaining ''n'' − ''u'' symbols the last ''n'' − ''u'' codewords of length ''k'' + 1. Because every codeword of length ''k'' + 1 consists of an unassigned codeword of length ''k'' with a "0" or "1" appended, the resulting code is a prefix code.

==Example with ''n'' = 5==

For the alphabet {0, 1, 2, 3, 4}, ''n'' = 5 and 2<sup>2</sup> ≤ ''n'' < 2<sup>3</sup>, hence ''k'' = 2 and ''u'' = 2<sup>3</sup> − 5 = 3. Truncated binary encoding assigns the first ''u'' symbols the codewords 00, 01, and 10, all of length 2, then assigns the last ''n'' − ''u'' symbols the codewords 110 and 111, the last two codewords of length 3.

For ''n'' = 5, plain binary encoding and truncated binary encoding allocate the following codewords.

{| class="wikitable"
! Value ''x'' !! Plain binary !! Truncated binary
|-
| 0 || 000 || 00
|-
| 1 || 001 || 01
|-
| 2 || 010 || 10
|-
| 3 || 011 || 110
|-
| 4 || 100 || 111
|}

It takes 3 bits to encode each of the ''n'' = 5 values using straightforward binary encoding, so 2<sup>3</sup> − ''n'' = 8 − 5 = 3 of the 8 possible codewords are unused.

In numerical terms, to send a value ''x'' where 0 ≤ ''x'' < ''n'' and 2<sup>''k''</sup> ≤ ''n'' < 2<sup>''k''+1</sup>, there are ''u'' = 2<sup>''k''+1</sup> − ''n'' unused entries when the alphabet size is rounded up to the next power of two. The process to encode the number ''x'' in truncated binary is:

* If ''x'' is less than ''u'', encode it in ''k'' binary bits.
* If ''x'' is greater than or equal to ''u'', encode the value ''x'' + ''u'' in ''k'' + 1 binary bits.
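
The encoding procedure described above can be sketched in Python. This is a minimal illustration, not a reference implementation: the function name <code>truncated_binary</code> is hypothetical, codewords are returned as strings of "0" and "1" characters for readability rather than packed bits, and returning an empty codeword for ''n'' = 1 is an assumption that follows from ''k'' = 0.

```python
def truncated_binary(x: int, n: int) -> str:
    """Return the truncated binary codeword for x in an alphabet of size n.

    Hypothetical helper for illustration; the codeword is a string of
    '0'/'1' characters rather than packed bits.
    """
    if n < 1 or not 0 <= x < n:
        raise ValueError("need n >= 1 and 0 <= x < n")

    k = n.bit_length() - 1      # k = floor(log2(n)), so 2**k <= n < 2**(k+1)
    u = (1 << (k + 1)) - n      # u = 2**(k+1) - n short codewords

    if x < u:
        # Short codeword: x written in k bits.
        # Assumption: for n == 1 (k == 0) the codeword is empty.
        return format(x, f"0{k}b") if k > 0 else ""
    # Long codeword: x + u written in k + 1 bits.
    return format(x + u, f"0{k + 1}b")


if __name__ == "__main__":
    for x in range(5):
        print(x, truncated_binary(x, 5))
```

For ''n'' = 5 this yields 00, 01, 10, 110, and 111 for ''x'' = 0 to 4, matching the example. When ''n'' is a power of two, ''u'' equals ''n'', so every value takes the short branch and the result is the plain binary code of length log<sub>2</sub>(''n'').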
Excerpt source: Wikipedia, the free encyclopedia, article "Truncated binary encoding".
|